Abstracts PLS Generalised Linear Regression: foundations and recent advances with variable selection and validation procedures

Authors

  • Afshin R. Khorrami
  • Vincenzo Esposito Vinzi
Abstract

PLS Generalised Linear Regression: foundations and recent advances with variable selection and validation procedures
Vincenzo Esposito Vinzi, Università degli Studi di Napoli

PLS (Partial Least Squares) univariate regression (PLS1) is a model linking a numerical dependent variable to a set of numerical (or dummy) explanatory variables. It is especially useful in situations where multiple regression is unstable or not feasible at all (strong multicollinearity, a small number of observations compared to the number of variables, missing data). The same kinds of problems may also be encountered in classical logistic regression and, more generally, when using a generalised linear model. It is possible to apply the same principles of PLS regression to logistic regression as well as to generalised linear models.

PLS1 can in fact be obtained by means of an iterated use of simple and multiple regressions based on ordinary least squares (OLS). By taking advantage of the statistical tests associated with linear regression, it is feasible to select the significant explanatory variables to include in PLS regression and to choose the number of PLS components to retain. The principle of the presented algorithm may be used in the same way to extend PLS regression to PLS generalised linear regression (PLS-GLR). PLS generalised linear regression retains the rationale of PLS, while the criterion optimised at each step is based on maximum likelihood. Nevertheless, the acronym PLS is kept as a reference to a general methodology for dealing with a set of predictors. The approach proposed for PLS-GLR is simple and easy to implement. Moreover, it can be easily generalised to any model that is linear at the level of the explanatory variables. Some examples show the use of the proposed methods in real practice, with specific reference to classical PLS regression, PLS logistic regression and the application of PLS-GLR.

PLS and Sensory Analysis
M. Tenenhaus, J. Pagès, L. Ambroisine, C. Guinot
HEC School of Management-Paris, France; ENSA-INSFA, France; CE.R.I.E.S., France

This paper describes a new methodology devoted to the analysis of a type of data often encountered in sensory analysis. We are interested in everyday products such as orange juices, yogurts and lipsticks. A small number of products are described by physico-chemical and sensory characteristics. Moreover, these products are evaluated by consumers on a preference scale. The objective of the statistical analysis is to relate the hedonic block of variables Y (the consumers' preferences) to the physico-chemical block X1 and to the sensory block X2. Generally, the data table to be analysed has about 10 rows, about 30 predictors X and about 100 responses Y. Some data can also be missing. PLS methods are well suited to this type of problem.

PLS regression allows the analysis of the link between the responses Y and the predictors X = [X1, X2]. Using this method it is possible to cluster the consumers into groups that are homogeneous with respect to their tastes and whose behaviour can be related to the characteristics of the products. For each homogeneous group of consumers, PLS regression yields a graphical display of the products with their characteristics and a mapping of the consumers based on their preferences, or contour lines on product preferences. Statistical validation can be obtained by jackknife.

PLS path modeling allows a more detailed analysis of each homogeneous group of consumers by building a causal scheme: each homogeneous block of consumers is related to the physico-chemical block X1 and the sensory block X2, and the sensory block is itself related to the physico-chemical block. Statistical validation is carried out by bootstrap. PLS methods yield results that are more complete and easier to understand than those of the usual methods: PLS methods fit sensory analysis like a hand in a glove.
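The first abstract's remark that PLS1 can be obtained through an iterated use of simple and multiple OLS regressions can be illustrated with a minimal sketch. It assumes the classical NIPALS-style PLS1 on standardised predictors, without the significance tests or the PLS-GLR extension mentioned in the abstract; the function name `pls1_fit` and its arguments are hypothetical and are not taken from the paper.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Sketch of classical PLS1 (hypothetical helper, not the authors' code).

    Assumes no missing data and no constant column in X.
    Returns coefficients on the original X scale, the intercept, and the scores.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, x_std = X.mean(axis=0), X.std(axis=0, ddof=1)
    y_mean = y.mean()
    Xh = (X - x_mean) / x_std          # standardised predictors
    yc = y - y_mean                    # centred response
    n, p = Xh.shape
    T = np.zeros((n, n_components))    # PLS components (scores)
    W = np.zeros((p, n_components))    # weights
    P = np.zeros((p, n_components))    # loadings
    for h in range(n_components):
        # Weights: cross-products of the response with each current predictor
        # column (proportional to the covariances), normalised to unit length.
        w = Xh.T @ yc
        w /= np.linalg.norm(w)
        t = Xh @ w                     # new component
        # Loadings: slope of the simple OLS regression of each column on t.
        p_h = Xh.T @ t / (t @ t)
        Xh = Xh - np.outer(t, p_h)     # keep only the residuals of X
        T[:, h], W[:, h], P[:, h] = t, w, p_h
    # Final step: one multiple OLS regression of y on the retained components.
    c, *_ = np.linalg.lstsq(T, yc, rcond=None)
    # Express the fit in terms of the original predictors.
    b_std = W @ np.linalg.solve(P.T @ W, c)
    coef = b_std / x_std
    intercept = y_mean - x_mean @ coef
    return coef, intercept, T

# Illustrative data only: fewer observations (10) than predictors (30),
# a setting where ordinary multiple regression is not feasible.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(10, 30))
y_demo = X_demo[:, 0] - X_demo[:, 1] + 0.1 * rng.normal(size=10)
coef_demo, intercept_demo, T_demo = pls1_fit(X_demo, y_demo, n_components=2)
y_hat_demo = X_demo @ coef_demo + intercept_demo
```

Because each component is built from the residuals left by the previous ones, the components are mutually orthogonal, which keeps the final multiple regression well conditioned even with strongly collinear predictors.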

Related resources

PLS generalised linear regression

PLS univariate regression is a model linking a dependent variable y to a set X = {x1, ..., xp} of (numerical or categorical) explanatory variables. It can be obtained as a series of simple and multiple regressions. By taking advantage of the statistical tests associated with linear regression, it is feasible to select the significant explanatory variables to include in PLS regression and to ...

Sorting variables by using informative vectors as a strategy for feature selection in multivariate regression

J. Chemom A new procedure with high ability to enhance prediction of multivariate calibration models with a small number of interpretable variables is presented. The core of this methodology is to sort the variables from an informative vector, followed by a systematic investigation of PLS regression models with the aim of finding the most relevant set of variables by comparing the cross-validat...

Multivariate linear QSPR/QSAR models: Rigorous evaluation of variable selection for PLS

Basic chemometric methods for making empirical regression models for QSPR/QSAR are briefly described from a user's point of view. Emphasis is given to PLS regression, simple variable selection and a careful and cautious evaluation of the performance of PLS models by repeated double cross validation (rdCV). A demonstration example is worked out for QSPR models that predict gas chromatographic re...

Important Molecular Descriptors Selection Using Self Tuned Reweighted Sampling Method for Prediction of Antituberculosis Activity

In this paper, a new descriptor selection method for selecting an optimal combination of important descriptors of sulfonamide derivatives data, named self tuned reweighted sampling (STRS), is developed. Important descriptors are defined as the descriptors with large absolute coefficients in a multivariate linear regression model such as partial least squares (PLS). In this study, the absolute values of r...

Partial least squares regression and projection on latent structure regression (PLS Regression)

Partial least squares (PLS) regression (a.k.a. projection on latent structures) is a recent technique that combines features from and generalizes principal component analysis (PCA) and multiple linear regression. Its goal is to predict a set of dependent variables from a set of independent variables or predictors. This prediction is achieved by extracting from the predictors a set of orthogonal ...

Publication date: 2004